ABSTRACT



This review of the history of the measurement of 
reading comprehension follows the development of formal and informal 
reading comprehension tests dating from 1913 to the present. Upon 
reviewing the aspects, procedures, and criteria of these tests, the 
author noted that most of these tests resembled group verbal 
intelligence tests. Makeup of the tests was discussed to show how 
they were related to purposes for reading. It was felt that most 
tests had attempted to measure reading comprehension as a mere 
thought-getting process unrelated to reading purposes and that these
tests limit the purposes for reading to the examinee's ability to 
achieve the test developer's purpose. A number of problems with 
regard to these tests were raised. It was concluded that instead of a 
continuous process of development, improvement, and increased 
knowledge in that area over the past 50 years, there had been merely 
a continuous rediscovery of old ideas and a continuous search for the 
elusive, definitive theory of reading comprehension which can serve 
as a basis for all measures of reading comprehension. Standardized 
reading tests which seem to measure reading comprehension are 
included in a table. Diagrams of various theories of reading 
comprehension and references are also included. (AH) 
This paper is the result of an investigation of the history of the measurement
of reading comprehension, undertaken to ascertain whether there has been a
continuous process of development, improvement, and increased knowledge in that
area over the past fifty years or whether there has been merely a continuous
rediscovery of old ideas and a continuous search for the elusive, definitive
theory of reading which can serve as a basis for all measures of reading
comprehension. As will be shown, the results of my investigation suggest that
the latter more accurately describes the situation.

It would be impossible to say exactly when the measurement of reading
comprehension began; for to answer that question, we would have to determine the
first time a graphic symbol was interpreted by a "reader," who then demonstrated
through subsequent behavior that he had understood that message. The question is
obviously unanswerable.

But we can hypothesize that from the very beginning the reader's comprehen- 
sion was determined in great part by his purpose for reading. Reading comprehen- 
sion, before the scientific advent of measuring instruments, was probably deter- 
mined by how well a reader achieved his purpose when using print as a medium. 

With the scientific advent of standardized measuring instruments, the search
for the "psychological construct" we refer to as reading comprehension got
underway, and the reader's purpose for reading was almost entirely forgotten.
It is quite probable that early attempts to measure reading comprehension were
based on the prior development of intelligence measures. Early measures of
reading comprehension attempted to measure reading comprehension as if it were
a general behavior that was the same under all conditions.

* The author wishes to thank Michael Smith for his assistance in the research
for this study.

The very earliest formal and informal measures of reading comprehension
mentioned in the published literature were based on how well a reader could
reproduce what he had read. For example, in 1913 Pintner (17) reported a study
which compared the oral and silent reading comprehension of fourth grade
pupils. Pintner's method of measurement was to ask each child, after he had
read, "to write down as much as he could of the matter read." In addition,
Pintner's method did not allow the examinees to look back after they had
completed reading. Pintner's reproduction method is still employed today on the
Silent Reading subtest of the Durrell Analysis of Reading Difficulty (8).
However, on the Durrell test the students only have to recite their "memories"
orally and do not have to write them.

In 1914, Brown (3) published three criteria for reading measurement: "Three
things which must be accurately weighed in order to have a complete measure of
reading power are: 1) rate of reading, 2) quantity of reproduction, and 3)
quality of reproduction." The first reading comprehension tests published seem
to meet the first and second of Brown's criteria but not the third.

It has been fairly well documented that the Gray Standardized Reading
Paragraphs (12), published in 1915, was the first published reading test.
However, this test included no measure of reading comprehension. Probably the
first published reading comprehension measure was the Kansas Silent Reading
Test (14), devised by F. J. Kelly and published about 1916.

The Kansas Silent Reading Test resembles many of our group verbal intelligence
tests of today. Two items from the list for grades 6, 7, and 8 will illustrate
this point:

"A farmer puts one-half the hay from his field into the first stack, then
two-thirds of what is left into a second stack. Which stack is the largest?"

"Below are two squares and a circle. If the circle is the largest of the
three, put a cross in it. If one square is smaller than the circle, put a
cross in the large square. If both squares are smaller than the circle, put
a cross in the small square."

Reading tests today still bear a strong resemblance to group verbal
intelligence tests. For example, the Reading Comprehension section of the
California Achievement Test (19) includes these items:

"Good morning, little boy," said the policeman. "May I help you?"
"I am lost and I cannot find my way home," said Jack. "Please help me."

14. The policeman said,
   1) "Call your father."
   2) "I am in a hurry."
   3) "I will take you home."

The sheep were playing in the woods and eating the grass. The wolf came to
the woods.

15. Then the sheep
   1) went on eating.
   2) ran to the barn.
   3) ran to the wolf.

Other early reading comprehension tests, all published by 1920, were the
Courtis Silent Reading Test (7), Monroe's Standardized Silent Reading Tests
(16), the Haggerty Reading Examination (13), and the Chapman Reading
Comprehension Test (6). The Courtis test was a timed test in which a pupil was
given three minutes to read as much as he could of a two-page story. He was
then given the same story, but this time it was broken into a series of short
paragraphs. A set of five yes-no questions followed each paragraph, and the
student was given five minutes to answer as many questions as he could.

Monroe's test was a four-minute timed test in which the examinee was to read
a series of paragraphs. Following each paragraph was a list of five words, and
the examinee was to underline the correct word according to information
contained in the paragraph. An example from the test for grades 6, 7, and 8
follows:

Nanook, once so full of life, now knew perfectly well that it was all
over with him. Head and tail down, the picture of resigned dejection,
he stood like a petrified dog. Draw a line under the word which best
describes the dog Nanook: angry - frightened - hungry - down-hearted.

A procedure for measuring rate of reading comprehension very similar to
Monroe's test is still used today. The Speed and Accuracy subtest of the
Gates-MacGinitie Reading Test (10) published in 1961 includes a series of
paragraphs which are read by the examinee. Following each paragraph are four
words, and the examinee selects the word which answers a question related to
the main idea of the paragraph. For example:

Eskimos learn to turn their kayaks upside down in the water and right them
again quickly. This maneuver requires courage as well as
   weapons   hours   skill   ever

The Apache Indians were nomads. They preferred following game to growing
crops. They chose to fight rather than raise sheep. The Apaches were
   hunters   sheepherders   farmers   cowards

The Haggerty Reading Examination for grades to 12, published about 1918, 
included a vocabulary test, a sentence comprehension test, and a paragraph compre- 
hension test. The sentence comprehension test consisted of forty statements 
answered with a "yes" or "no." The paragraph comprehension test consisted of a
series of seven paragraphs each followed by true and false statements. Timing was 
an important factor, as it was in all of these early tests. 

The Chapman Reading Comprehension Test for grades 5 through 12, published in
1920, included a series of paragraphs. The examinee was told that the second
half of each paragraph included one word which spoiled the meaning of the
paragraph and a line was to be drawn through that word. For example:

"The primary characteristic of a hero is his sincerity: first and foremost
he must believe in his cause. In the absence of such sincerity and without
such belief, he will follow the straight path along which his hopes may be
attained and his ambitions realized."




In recent years, the Chapman procedure seems to have been rediscovered in
slightly different form. The Stanford High School Reading Test (9) published in
1965 and the Gates-MacGinitie Reading Test published in 1964 both utilize a
procedure in which examinees are to demonstrate their comprehension of a
paragraph by supplying an appropriate word to complete a sentence. In both of
these tests the word is to be chosen from a given set of choices. An example
from the Stanford High School Reading Test follows:

Just as a person's family helps him and stands by him in case of need,
so does his clan support him when he needs ___1___. This ___2___ may
range from helping him collect the price of a bride to protecting his
___3___ should he incur the wrath of other clansmen bent on
blood-vengeance.

1   1 friendship     2 money          3 aid         4 strength
2   5 relationship   6 guidance       7 sharing     8 support
3   1 wife           2 possessions    3 life        4 property


The examples of early reading comprehension measures above demonstrate that
the first two of Brown's 1914 criteria were generally being adhered to. Rate
was certainly an important aspect of each of these tests, and the measures were
generally attempting to determine how well an examinee could understand a
written communication. However, the third criterion, quality of reproduction,
was neglected. By contrasting these early reading comprehension tests with our
tests of today, we might wonder how far we have come in improving our
measurement of the quality of reproduction. Perhaps progress has been slight
because these first tests, and our tests of today, limit the purposes for
reading to the examinee's ability to achieve the test developer's purpose.
Indeed, reading comprehension was usually defined during the early '20's as the
process of thought-getting. "This thought-getting may consist in the mere
understanding of sentences or in the interpretation of paragraphs or whole
selections, or it may be a combination of all these factors" (Gilliland and
Jordan, 11, p. 93).

The earliest reviews of these reading comprehension measures raised many
questions about what a particular test was actually measuring. For example, the
following criticisms of various reading comprehension tests appeared in the
Buros 1938 Mental Measurements Yearbook (4):

"A valuable feature of the tests on reading comprehension is the effort to
measure the pupils' ability to make inferences from the material read. However,
portions of the tests may measure intelligence rather than reading ability"
(p. 131, Joseph C. Dewey - on reviewing the Metropolitan Achievement Tests
(Reading), copyright 1931).

"The tests have been validated by customary intercorrelation with other
reading tests but more especially by selecting test situations which 'represent
the essential elements of the basic skills which are needed for success' in the
work of the grades for which the battery is appropriate . . . . The
comprehension tests require pupils (1) to follow directions, (2) to interpret
meaning, and (3) to organize materials. In common with most comprehension
tests, some of the items are open to criticism on the ground that they could be
answered correctly by many pupils without having read the test paragraph on
which they depend" (pp. 136-137, Ivan A. Brooker - on reviewing the Progressive
Reading Tests, copyright 1934).

"Although the authors state that they seek to measure the ability of pupils
to interpret what is read and to make inferences, it appears that the questions
used for this purpose require the reproduction of facts stated directly in the
reading text, rather than inferences made from these facts" (p. 137, Joseph C.
Dewey - on reviewing the Progressive Reading Tests, copyright 1934).

"However, if the Traxler tests on comprehension are broken down, it will be
shown that comprehension of the paragraphs is measured by asking for: (a)
details directly stated in the content, (b) details implied in the content
(involving multiple-choice technique), (c) 'yes' 'no' answers, (d) total
meanings, (e) central thought, etc. These techniques have long been proven
good, but this application to the paragraphs in either of the forms is not
consistent. Some paragraphs are followed by only one type of question, others
by two or three, seemingly without plan. In addition, if the two forms are
compared, the inconsistency is increased more than ever. Another way of putting
this point is to inquire, What is paragraph comprehension? Is it something that
is to be measured only through details in one instance, and a combination of
several different techniques in another?" (p. 139, Spencer Shank - on reviewing
the Traxler Silent Reading Test, copyright 1934).

It is distressingly obvious that most of these questions are still being
raised today. For example, the following quotes are from the Buros Sixth Mental
Measurements Yearbook (5) published in 1965:

"The first deficiency is the total lack of evidence regarding the factorial
compositions of the reading tests. It is admitted that the tests measure a
complex set of reading skills, but no evidence is forthcoming to support the
contention that the chosen 'five major reading-for-comprehension skills' are
major components of reading ability, or that the STEP reading tests do actually
'weight these five kinds of skills approximately equally.' All we know is that
a committee of authorities agreed on this breakdown of reading into component
skills. With due respect for the committee, it would be highly desirable to
have their judgments tested and supported by empirical evidence" (p. 327, Paul
Lohnes - on reviewing the Sequential Tests of Educational Progress: Reading,
published in 1963).

"A useful technique is to attempt to answer reading comprehension items
before reading the selection (I wish the publishers would stop calling their
selections 'stories'). On Form 1, grades 7-8-9, items 59-67, this reviewer
answered correctly 8 out of the 9 questions about Switzerland without looking at
the passage" (p. 33^, Clarence Derrick - on reviewing the Survey of Reading
Achievement, published in 1959).

"The directions for administering and scoring are clear. The printing is
good. Graphic profiles of subtest scores are printed on the front of each
booklet. In fact, in most respects, the Developmental Reading Tests 'look' like
well-made standardized tests and this is perhaps what is so insidious. The
teacher follows the directions, the students mark the booklet, the tests are
scored, and Johnny gets a reading grade of 1.9. What on earth does this mean?"
(pp. 294-295, Edward Fry - on reviewing the Developmental Reading Tests,
published in 1961).

The continuous search for the elusive answer to the question of what reading
comprehension is probably encouraged the development of a vast multitude of
reading comprehension measures. Indeed, the period from the 1940's to today
could be labelled the time of sub-skills proliferation. Many tests merely
labelled the same subtests with different titles. Others had similar labels but
employed different question types. I recently collected a list of all of the
subtests from reading tests which seemed to be attempting to measure reading
comprehension. Most of these tests were published during the 1950's and 60's.
My list includes the fifty subtests listed in Table 1.

Insert Table 1 about here

What do all of the reading comprehension tests of today and those of the past
seem to be measuring? In my opinion they have all attempted to measure reading
comprehension as a "thought-getting process" which is generally unrelated to
specific reading purposes. The series of diagrams in Table 2 illustrates the
makeup of reading comprehension tests and how they are related to purposes for
reading.



Insert Table 2 about here 

The first diagram is an adaptation from a recent article by John Bormuth et al.
(1). That diagram describes the usual procedure of having a set of questions
follow the reading of a selection. The solid lines in the diagram indicate a
direct relationship while the broken line represents an implied relationship.
For example, in the first diagram there is only an implied relationship between
a student's response to a question and his comprehension of a reading
selection. The second diagram illustrates an item in which the examinees are
able to identify correct responses without reading a selection. It is amazing
how many questions on reading comprehension tests exemplify this second
diagram.

The third diagram indicates the Cloze procedure in which a solid line
establishes the direct relationship of response to comprehension of a
selection. The fourth describes the chunked test which has been developed by
Carver and Darby.

Each of these first four diagrams illustrates what I believe to be the basic
problem with reading comprehension measures. The tests are being developed as if
there were a well known theoretical construct called reading comprehension. Some
of the most recent attempts at developing reading comprehension measures have
emphasized that point. Schlessinger and Weiser (19) and Bormuth (2) have been
vehement advocates of more systematic approaches to the development of test
items. According to these authors, what is needed are item development
procedures which are based on the implicit language and organizational
structure of a written message. Bormuth, in fact, concludes a recent study:

"The most startling result was the fact that large proportions of the children
were unable to demonstrate a comprehension of even these basic structures by
which information is signaled, indicating that this deficiency may constitute a
serious impediment to the efficiency of instruction. The structures identified
seemed to represent homogeneous classes of behavior since the variation between
questions measuring different skills was significantly greater than the
variation between items measuring the same skill. The fact that the structures
and question types differed significantly in difficulty was also taken as
evidence that many of these skills may be hierarchically related" (1).

The fifth diagram is the avenue which I believe may lead us off the track of
trying to ascertain a general reading comprehension skill. This diagram, which
is probably not a description of any existing reading test, includes a direct
relation of purpose to the reading selection and to a student's response to
that selection based on the given purpose.

The approach that I am suggesting would not wait for tests to be developed
until there is sound theoretical and empirical evidence concerning the
components of reading ability, as Kingston (15) suggests. Instead we would begin
to list and organize reading purposes and tasks that lead adults and children to
printed material. Measures would be based on these purposes and tasks. For
example, we might include such tasks as having a child locate a phone number or
determine from a newspaper when his favorite television program is scheduled.
Certainly more complex purposes and tasks, such as determining a character's
motives to determine why he committed a certain act, should be included.
However, these purposes should be developed on the basis of meaningful reading
tasks; they should be related to realistic purposes and should not be tasks
that are derived from some authority's arbitrary decision as to what a reader's
purpose should be.

Conclusion 

It cannot be denied that the sophistication of test developers and test
reviewers has increased tremendously. They have definitely learned to ask more
probing questions about the theoretical construct of reading comprehension; they
have been able to provide more sophisticated technical data on reliability,
validity, and norming procedures; the editing of test items has improved
dramatically. But the essential questions are still the same: we are still
asking what reading comprehension is and how it should be measured. The
following questions about reading comprehension tests were raised in 1910; were
reiterated in 1938; and are still being asked today:

1. Why is there such a great overlap between measures of hypothesized
   different skills of reading comprehension?

2. What format should reading comprehension measures take: multiple choice
   questions? cloze? fill in? long passages or short? Should examinees
   be allowed to look back at a selection when answering questions?

3. What are the sub-skills of reading comprehension?

4. How strong an effect does prior knowledge of a topic have on an examinee's
   reading comprehension?

5. Does the language structure of a selection affect reading comprehension?

I suggest that these and other similar questions result in exercises in 
futility. The only validity of any importance is how well a test predicts a 
student’s ability to perform functional reading tasks. Reading measures need to 
be developed which are based on specific reading tasks and purposes for reading. 

Just as we have been generally disillusioned with our attempts to measure
intelligence as a psychological construct, perhaps we should also be
disillusioned with our attempts to measure reading comprehension as a
psychological construct. The measurement of reading comprehension should be
based on an attempt to determine how well a reader can accomplish a given task
with a given reading selection. From my perspective, the history of the
measurement of reading comprehension got started on a narrow, single track over
fifty years ago and has been chugging around in circles ever since. That is not
to say that increased sophistication in the technical, scientific, and even
artistic aspects has been non-existent. Indeed, some of the advances in those
aspects have been quite dramatic, but the essential problem is that the train
has never switched off the initial track; it has just had a streamlined engine
attached.

Table 1

*Subtests of Standardized Reading Tests
Which Seem to Measure Reading Comprehension

(Note: Not included are the myriad of study skills; also not included are the
wide range of vocabulary subtests which might in several cases be included with
this listing.)

1. Context Reading 

2. Sentence and Word Meaning 

3. Paragraph Meaning 
4. Comprehension

5. Level of Comprehension 

6. Speed of Comprehension 

7. Interpretation of Reading Materials

8. Interpretation

9. Organization

10. General Comprehension 

11. Specific Comprehension 

12. Reading to Retain Information 

13. Reading to Organize 

14. Reading to Evaluate - Interpret 

15. Reading to Appreciate 

16. Perception of Relations 

17. General Information 

18. Ability to Grasp the Central Thought 

19. Ability to Note Clearly Stated Details

20. Interpretation 

21. Integration of Dispersed Ideas 

22. Ability to Draw Inferences 

23. Recalling Information 

24. Reading to Locate Information 

25. Reading for Description 

26. Rate of Reading for Meaning 

27. Reading for Directions or Procedures 

28. Sentence Completion 

29. Retention of Details 

30. Directed Reading 

31. Comprehension Accuracy 

32. Reading Efficiency 

33. Reading 

34. Noting Details 

35. Interpreting Paragraphs 

36. Following Directions 

37. Reading for Inferences 

38. Reading for Main Ideas 

39. Summarizing 
40. Skimming

41. Recall of Information Read

42. Gross - Comprehension 

43. Comprehension - Efficiency 

44. Ability to Recall Ideas 

45. Ability to Translate Ideas and Make Inferences 

46. Ability to Analyze Motivation 

47. Ability to Analyze Presentation 

48. Ability to Criticize 

49. Reading for Information 

50. Story Comprehension 

* This list was developed from the test guide appendix in: Farr, Roger.
Reading: What Can Be Measured? Newark, Delaware: International Reading
Association, 1970.



Table 2

DESIGN: READING COMPREHENSION

(Diagram columns: Stimulus; Underlying Behavior; Response)

1.* Direct measurement of item comprehension; implied measurement of passage
comprehension.
[Diagram boxes: Control over purpose for reading passage; Student reads a
segment of language; Segment is comprehended; Student reads a relevant
question; Question is comprehended; Answer is derived; Student responds
overtly.]
(1) In Bormuth's diagram, this line is solid.
(2) This block does not appear in Bormuth's diagram.

2. Poor items: direct measurement of item comprehension; no measurement of
passage comprehension.
[Diagram boxes: Lack of control over purpose for reading passage; Student reads
a segment of language; Segment may or may not be comprehended; Student reads a
question; Question is comprehended; Student responds overtly.]

*Adapted from Bormuth, J. R., Carr, J., Manning, J., and Pearson, D.,
"Children's Comprehension of Between- and Within-Sentence Syntactic
Structures," Journal of Educational Psychology, 61 (October 1970), p. 350.

[Diagrams 3 and 4: only a caption fragment survives: "... may not, it tests
literal recall."]

5. Comprehension related to purpose.
[Diagram boxes, first version: Student comprehends purpose and accepts it;
Student reads and comprehends passage; Student derives answer related to
passage comp.; Student responds overtly.]
Prior learning is part of comprehension but not the sole factor, i.e., purpose
and background interact with the passage.
[Diagram boxes, second version: Student comprehends purpose and accepts it;
Student reads and comprehends passage; Student derives answer related only to
purpose; Student responds overtly.]